Phurti: Application and Network-aware Flow Scheduling for Mapreduce

نویسنده

  • Indranil Gupta
چکیده

Traffic for a typical MapReduce job in a datacenter consists of multiple network flows. Traditionally, network resources have been allocated to optimize network-level metrics such as flow completion time or throughput. Some recent schemes propose using application-aware scheduling which can reduce the average job completion time. However, most of them treat the core network as a black box with sufficient capacity. Even if only one network link in the core network becomes a bottleneck, it can hurt application performance. We design and implement a centralized flow scheduling framework called Phurti with the goal of decreasing the completion time for Hadoop MapReduce jobs. Phurti communicates both with the Hadoop framework to retrieve job-level network traffic information and the OpenFlow-based switches to learn about network topology. Phurti implements a novel heuristic called Smallest Maximum Sequential-traffic First (SMSF) that uses collected application and network information to perform traffic scheduling for MapReduce jobs. Our evaluation with real Hadoop workloads shows that compared to application and network-agnostic scheduling strategies, Phurti improves job completion time for 95% of the jobs, decreases average job completion time by 20% and tail job completion time by 13%.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Communication-Aware Traffic Stream Optimization for Virtual Machine Placement in Cloud Datacenters with VL2 Topology

By pervasiveness of cloud computing, a colossal amount of applications from gigantic organizations increasingly tend to rely on cloud services. These demands caused a great number of applications in form of couple of virtual machines (VMs) requests to be executed on data centers’ servers. Some of applications are as big as not possible to be processed upon a single VM. Also, there exists severa...

متن کامل

A JOINT DUTY CYCLE SCHEDULING AND ENERGY AWARE ROUTING APPROACH BASED ON EVOLUTIONARY GAME FOR WIRELESS SENSOR NETWORKS

Network throughput and energy conservation are two conflicting important performance metrics for wireless sensor networks. Since these two objectives are in conflict with each other, it is difficult to achieve them simultaneously. In this paper, a joint duty cycle scheduling and energy aware routing approach is proposed based on evolutionary game theory which is called DREG. Making a trade-off ...

متن کامل

Availability and Network-Aware MapReduce Task Scheduling over the Internet

MapReduce offers an ease-of-use programming paradigm for processing large datasets. In our previous work, we have designed a MapReduce framework called BitDew-MapReduce for desktop grid and volunteer computing environment, that allows nonexpert users to run data-intensive MapReduce jobs on top of volunteer resources over the Internet. However, network distance and resource availability have gre...

متن کامل

Data-Aware Scheduling in Datacenters

Title of dissertation: DATA-AWARE SCHEDULING IN DATACENTERS Manish Purohit, Doctor of Philosophy, 2016 Dissertation directed by: Professor Samir Khuller Department of Computer Science Datacenters have emerged as the dominant form of computing infrastructure over the last two decades. The tremendous increase in the requirements of data analysis has led to a proportional increase in power consump...

متن کامل

An Efficient Extension of Network Simplex Algorithm

In this paper, an efficient extension of network simplex algorithm is presented. In static scheduling problem, where there is no change in situation, the challenge is that the large problems can be solved in a short time. In this paper, the Static Scheduling problem of Automated Guided Vehicles in container terminal is solved by Network Simplex Algorithm (NSA) and NSA+, which extended the stand...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015